Randomise the list



Number list 



Annotation 










Name    Chromosome  Copies  Start  Stop
CYP2D6  3           1       56     1065
ACO1    5           1       7865   8763
HIA1    12          1       12     2000
ABCA6   X           1       5748   6003






Annotation 










   Name    Chromosome  Copies  Start  Stop
1  CYP2D6  3           1       56     1065
2  ACO1    5           1       7865   8763
3  HIA1    12          1       12     2000
4  ABCA6   X           1       5748   6003



Separate the datasets of 
gene names and gene 
annotation information. 



Gene names 

1 CYP2D6 

2 ACO1

3 HIA1 

4 ABCA6 



Gene data 


1  3, 1, 56, 1065
2  5, 1, 7865, 8763
3  12, 1, 12, 2000
4  X, 1, 5748, 6003



Figure 1 
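
The steps of Figure 1 (number the annotation list, randomise it, then separate gene names from gene annotation information) can be sketched roughly as follows. This is a minimal illustration only: the function names `split_annotation` and `lookup`, the `seed` parameter, and the choice to shuffle the numbered rows before splitting them are assumptions, not details given in the figure.

```python
import random

# Annotation rows from Figure 1: (name, chromosome, copies, start, stop).
ANNOTATION = [
    ("CYP2D6", "3", 1, 56, 1065),
    ("ACO1", "5", 1, 7865, 8763),
    ("HIA1", "12", 1, 12, 2000),
    ("ABCA6", "X", 1, 5748, 6003),
]


def split_annotation(rows, seed=None):
    """Number the rows, shuffle them, and split names from gene data.

    Hypothetical helper: returns two lists that share only the row number.
    """
    numbered = list(enumerate(rows, start=1))   # "Number list"
    random.Random(seed).shuffle(numbered)       # "Randomise the list"
    gene_names = [(number, row[0]) for number, row in numbered]
    gene_data = [(number, row[1:]) for number, row in numbered]
    return gene_names, gene_data


def lookup(gene, gene_names, gene_data):
    """Rejoin the two datasets through the shared row number."""
    number = {name: n for n, name in gene_names}[gene]
    return dict(gene_data)[number]


gene_names, gene_data = split_annotation(ANNOTATION)
print(lookup("HIA1", gene_names, gene_data))
```

Neither dataset on its own links a gene name to its annotation; the link exists only through the row number that both datasets carry.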
Sequence:
acaaaccaca 



Read n characters 



Convert the string of 
characters to a symbol by 
using a lookup table. 







Number each symbol. 






Shuffle the sequence of 
numbered symbols 



Split symbols and 
numbering 



n=2 

ac aa ac ca ca 



Lookup: aa = w, ac = x, cc = y, ca = z



ac aa ac ca ca = x w x z z 



x1 w2 x3 z4 z5



x3 z5 z4 x1 w2




Dataset 1: 3 5 4 1 2
Dataset 2: xzzxw



Figure 2 
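
The encoding steps of Figure 2 (read n characters, convert them to symbols with a lookup table, number the symbols, shuffle, then split symbols from numbering) can be sketched as below. The lookup table and the example sequence are taken from the figure; the function name `encode`, the `seed` parameter, and the use of Python's `random` module for the shuffle are assumptions made for illustration.

```python
import random

# Lookup table from Figure 2 (n = 2 characters per symbol).
LOOKUP = {"aa": "w", "ac": "x", "cc": "y", "ca": "z"}


def encode(sequence, n=2, table=LOOKUP, seed=None):
    """Return (positions, symbols) for a sequence, as in Figure 2.

    Hypothetical helper: `positions` corresponds to Dataset 1 and
    `symbols` to Dataset 2.
    """
    if len(sequence) % n:
        raise ValueError("sequence length must be a multiple of n")
    # Read n characters at a time: "acaaaccaca" -> ac aa ac ca ca
    chunks = [sequence[i:i + n] for i in range(0, len(sequence), n)]
    # Convert each chunk to a symbol: x w x z z
    symbols = [table[chunk] for chunk in chunks]
    # Number each symbol: x1 w2 x3 z4 z5
    numbered = list(enumerate(symbols, start=1))
    # Shuffle the numbered symbols, e.g. x3 z5 z4 x1 w2
    random.Random(seed).shuffle(numbered)
    # Split the numbering from the symbols.
    positions = [position for position, _ in numbered]
    shuffled = "".join(symbol for _, symbol in numbered)
    return positions, shuffled


positions, symbols = encode("acaaaccaca")
print(positions, symbols)
```

The order produced by the shuffle varies from run to run; the ordering shown in the figure (3 5 4 1 2, xzzxw) is one possible outcome.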
Identify gene of
interest 



Identify annotation 
for gene of interest 



Determine sequence 
fragment identity and 
retrieve datasets 



Combine datasets 



Unrandomise the 
sequence of numbered 
symbols 



Convert the string of 
symbols to characters by 
using a lookup table. 



Identify sequences of 
interest (e.g. start and end
of gene) using annotation. 






HIA1 



Name  Chromosome  Copies  Seq.Frag ID.  Start  Stop
HIA1  12          1       94            10     1998






Dataset 94a: 3 5 4 1 2
Dataset 94b: xzzxw






x3 z5 z4 x1 w2



x1 w2 x3 z4 z5



Lookup: w = aa, x = ac, y = cc, z = ca

x w x z z = ac aa ac ca ca 



Sequence: acaaaccaca
Start: 10
Stop: 1998



Figure 3 
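
The retrieval steps of Figure 3 (retrieve the two datasets for a sequence fragment, combine them, unrandomise the numbered symbols, and convert the symbols back to characters) can be sketched as below. The reverse lookup table and the contents of datasets 94a and 94b come from the figure; the `DATASETS` dictionary and the function name `decode` are hypothetical stand-ins for whatever storage and retrieval mechanism is actually used.

```python
# Reverse lookup table from Figure 3.
REVERSE_LOOKUP = {"w": "aa", "x": "ac", "y": "cc", "z": "ca"}

# Hypothetical store holding the two datasets of sequence fragment 94.
DATASETS = {
    "94a": [3, 5, 4, 1, 2],  # numbering
    "94b": "xzzxw",          # shuffled symbols
}


def decode(positions, symbols, table=REVERSE_LOOKUP):
    """Recombine numbering and symbols and restore the sequence."""
    if len(positions) != len(symbols):
        raise ValueError("positions and symbols must have equal length")
    # Combine datasets: x3 z5 z4 x1 w2
    combined = zip(positions, symbols)
    # Unrandomise by sorting on the numbering: x1 w2 x3 z4 z5
    ordered = [symbol for _, symbol in sorted(combined)]
    # Convert symbols back to characters: ac aa ac ca ca
    return "".join(table[symbol] for symbol in ordered)


sequence = decode(DATASETS["94a"], DATASETS["94b"])
print(sequence)  # acaaaccaca
```

Once the sequence is restored, the Start and Stop values from the annotation record can be used to locate the region of interest within it.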



